Combining Multiple Knowledge Sources for Discourse Segmentation
Abstract
We predict discourse segment boundaries from linguistic features of utterances, using a corpus of spoken narratives as data. We present two methods for developing segmentation algorithms from training data: hand tuning and machine learning. When multiple types of features are used, results approach human performance on an independent test set (both methods), and using cross-validation (machine learning).
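As a rough illustration of the hand-tuning approach, the sketch below combines several binary utterance features into a single boundary decision and scores it against reference boundaries. The abstract does not name the features or the combination rule, so the three features, the `min_votes` threshold, and all identifiers here are hypothetical and chosen only for illustration; this is not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical binary features standing in for the paper's
    # "linguistic features of utterances"; the abstract does not list them.
    starts_with_cue_word: bool
    preceded_by_long_pause: bool
    introduces_new_referent: bool


def predict_boundary(u: Utterance, min_votes: int = 2) -> bool:
    """Predict a segment boundary when at least `min_votes` features fire.

    `min_votes` is the hand-tuned parameter in this sketch (an assumption).
    """
    votes = sum([
        u.starts_with_cue_word,
        u.preceded_by_long_pause,
        u.introduces_new_referent,
    ])
    return votes >= min_votes


def precision_recall(predicted, gold):
    """Compare predicted boundaries with reference (e.g. human) boundaries."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(g and not p for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for this example only.
utterances = [
    Utterance(True, True, False),
    Utterance(False, False, True),
    Utterance(True, False, True),
    Utterance(False, False, False),
]
gold = [True, False, False, False]
predicted = [predict_boundary(u) for u in utterances]
precision, recall = precision_recall(predicted, gold)
```

A machine learning variant would replace the fixed voting rule with a classifier induced from training data over the same feature vectors, and would typically be evaluated both on a held-out test set and with cross-validation, as the abstract describes.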
Similar resources
Discourse Segmentation of Multi-Party Conversation
We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded...
Discourse Segmentation by Human and Automated Means
The need to model the relation between discourse structure and linguistic features of utterances is almost universally acknowledged in the literature on discourse. However, there is only weak consensus on what the units of discourse structure are, or the criteria for recognizing and generating them. We present quantitative results of a two-part study using a corpus of spontaneous, narrative mon...
Combining data mining and group decision making in retailer segmentation based on LRFMP variables
Data mining is a powerful tool for firms to extract knowledge from their customers' transaction data. One useful application of data mining is segmentation, which helps managers devise the right marketing strategies for the right customer segments. In this study we segmented the retailers of a hygiene products manufacturer. Nowadays all manufacturers understand that for st...
Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives
Automatic segmentation is important for making multimedia archives comprehensible, and for developing downstream information retrieval and extraction modules. In this study, we explore approaches that can segment multiparty conversational speech by integrating various knowledge sources (e.g., words, audio and video recordings, speaker intention and context). In particular, we evaluate the perfo...
Automated Video Segmentation for Lecture Videos: A Linguistics-Based Approach
Video, a rich information source, is commonly used for capturing and sharing knowledge in learning systems. However, the unstructured and linear features of video introduce difficulties for end users in accessing the knowledge captured in videos. To extract the knowledge structures hidden in a lengthy, multi-topic lecture video and thus make it easily accessible, we need to first segment the vi...